Overview

Dataset statistics

Number of variables: 13
Number of observations: 9646
Missing cells: 125
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 979.8 KiB
Average record size in memory: 104.0 B
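
The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, using only numbers reported in this section (the variable names are illustrative):

```python
# Values copied from the "Dataset statistics" table.
n_variables = 13
n_observations = 9646
missing_cells = 125
total_size_kib = 979.8

# Share of missing cells among all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record: total size converted from KiB to bytes,
# divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```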

Variable types

Categorical: 3
Numeric: 10

Alerts

agency has a high cardinality: 1056 distinct values [High cardinality]
population is highly correlated with robbery and 1 other fields [High correlation]
robbery is highly correlated with population and 4 other fields [High correlation]
assault is highly correlated with population and 4 other fields [High correlation]
burglary is highly correlated with robbery and 4 other fields [High correlation]
larceny is highly correlated with robbery and 4 other fields [High correlation]
auto_theft is highly correlated with burglary and 2 other fields [High correlation]
total is highly correlated with robbery and 4 other fields [High correlation]
larceny is highly correlated with auto_theft and 1 other fields [High correlation]
auto_theft is highly correlated with larceny and 1 other fields [High correlation]
total is highly correlated with larceny and 1 other fields [High correlation]
robbery is highly correlated with assault [High correlation]
assault is highly correlated with robbery [High correlation]
burglary is highly correlated with larceny and 2 other fields [High correlation]
larceny is highly correlated with burglary and 2 other fields [High correlation]
auto_theft is highly correlated with burglary and 2 other fields [High correlation]
total is highly correlated with burglary and 2 other fields [High correlation]
county_name is highly correlated with population [High correlation]
population is highly correlated with county_name [High correlation]
murder is highly correlated with rape [High correlation]
rape is highly correlated with murder [High correlation]
robbery is highly correlated with larceny and 2 other fields [High correlation]
assault is highly correlated with larceny and 1 other fields [High correlation]
larceny is highly correlated with robbery and 3 other fields [High correlation]
auto_theft is highly correlated with robbery and 2 other fields [High correlation]
total is highly correlated with robbery and 3 other fields [High correlation]
population has 125 (1.3%) missing values [Missing]
robbery is highly skewed (γ1 = 50.75171203) [Skewed]
assault is highly skewed (γ1 = 79.5400419) [Skewed]
larceny is highly skewed (γ1 = 47.28986587) [Skewed]
auto_theft is highly skewed (γ1 = 31.76598427) [Skewed]
total is highly skewed (γ1 = 45.38515893) [Skewed]
population has 1135 (11.8%) zeros [Zeros]
murder has 8427 (87.4%) zeros [Zeros]
rape has 6523 (67.6%) zeros [Zeros]
robbery has 5179 (53.7%) zeros [Zeros]
assault has 2706 (28.1%) zeros [Zeros]
burglary has 2492 (25.8%) zeros [Zeros]
larceny has 1326 (13.7%) zeros [Zeros]
auto_theft has 3954 (41.0%) zeros [Zeros]
total has 913 (9.5%) zeros [Zeros]
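
The Skewed and Zeros alerts rest on two simple quantities: the skewness coefficient γ1 and the share of zero entries. The sketch below computes both in plain Python. It uses the population form of skewness with no small-sample correction, which is an assumption; the report's own estimator may differ slightly. The toy column is invented for illustration.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (assumes non-constant data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Percentage of entries equal to zero."""
    return 100 * sum(1 for v in values if v == 0) / len(values)


# Toy column: mostly zeros plus one extreme value, similar in shape to
# the crime-count columns flagged above.
toy = [0] * 95 + [1, 2, 3, 4, 1000]
print(zero_share(toy), skewness(toy))

# The reported zero percentage for murder follows from the counts:
murder_zero_pct = round(100 * 8427 / 9646, 1)
```

A single extreme value among many zeros is enough to push γ1 far above the values seen for roughly symmetric data, which is consistent with the very large maxima reported for the flagged columns.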

Reproduction

Analysis started: 2023-03-27 19:56:57.120152
Analysis finished: 2023-03-27 19:57:30.017158
Duration: 32.9 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
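
A report of this kind is typically produced with a few lines of pandas-profiling. The sketch below is one plausible way to do it, not the exact command used here: the function name, the CSV path, the output path, and the title are all hypothetical, and the configuration in config.json is not reproduced.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so the function can be defined even when the
    # packages are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling Report")  # title is an assumption
    report.to_file(output_path)
    return report
```

Calling `build_report("crime.csv")` (file name assumed) would write the HTML report next to the script.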

Variables

county_name
Categorical

HIGH CORRELATION

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 75.5 KiB
BERGEN: 1250
MONMOUTH: 891
MORRIS: 706
CAMDEN: 689
BURLINGTON: 589
Other values (16): 5521

Length

Max length: 10
Median length: 6
Mean length: 6.972838482
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ATLANTIC
2nd row: ATLANTIC
3rd row: ATLANTIC
4th row: ATLANTIC
5th row: ATLANTIC

Common Values

Value | Count | Frequency (%)
BERGEN | 1250 | 13.0%
MONMOUTH | 891 | 9.2%
MORRIS | 706 | 7.3%
CAMDEN | 689 | 7.1%
BURLINGTON | 589 | 6.1%
OCEAN | 589 | 6.1%
MIDDLESEX | 500 | 5.2%
ESSEX | 494 | 5.1%
GLOUCESTER | 468 | 4.9%
UNION | 432 | 4.5%
Other values (11) | 3038 | 31.5%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
bergen | 1250 | 12.6%
monmouth | 891 | 9.0%
morris | 706 | 7.1%
camden | 689 | 7.0%
burlington | 589 | 5.9%
ocean | 589 | 5.9%
middlesex | 500 | 5.0%
essex | 494 | 5.0%
gloucester | 468 | 4.7%
union | 432 | 4.4%
Other values (12) | 3304 | 33.3%


year
Real number (ℝ≥0)

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.797636
Minimum: 2017
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 2017
5-th percentile: 2017
Q1: 2018
Median: 2019
Q3: 2020
95-th percentile: 2020
Maximum: 2020
Range: 3
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9813705944
Coefficient of variation (CV): 0.0004861163777
Kurtosis: -1.053998367
Mean: 2018.797636
Median Absolute Deviation (MAD): 1
Skewness: -0.2291850482
Sum: 19473322
Variance: 0.9630882435
Monotonicity: Increasing
[Figure: Histogram with fixed size bins (bins=4)]
Value | Count | Frequency (%)
2018 | 2890 | 30.0%
2019 | 2890 | 30.0%
2020 | 2890 | 30.0%
2017 | 976 | 10.1%

Value | Count | Frequency (%)
2017 | 976 | 10.1%
2018 | 2890 | 30.0%
2019 | 2890 | 30.0%
2020 | 2890 | 30.0%

Value | Count | Frequency (%)
2020 | 2890 | 30.0%
2019 | 2890 | 30.0%
2018 | 2890 | 30.0%
2017 | 976 | 10.1%
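
Several of the descriptive statistics reported for year are related by simple identities. The following sketch checks them using only figures copied from this report; the variable names are illustrative.

```python
# Values copied from the year statistics and the Overview.
mean = 2018.797636
std = 0.9813705944
total = 19473322
n = 9646  # number of observations

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance is the squared standard deviation
mean_from_sum = total / n  # mean recovered from the sum
iqr = 2020 - 2018          # Q3 - Q1

print(cv, variance, mean_from_sum, iqr)
```

The results agree with the reported CV, variance, mean, and IQR up to the rounding of the printed inputs.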

agency
Categorical

HIGH CARDINALITY

Distinct: 1056
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 75.5 KiB
WASHINGTON TWP PD: 60
FRANKLIN TWP PD: 45
SPRINGFIELD TWP PD: 30
MANSFIELD TWP PD: 30
MONROE TWP PD: 30
Other values (1051): 9451

Length

Max length: 39
Median length: 15
Mean length: 16.07609372
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Absecon
2nd row: Absecon
3rd row: Atlantic City
4th row: Atlantic City
5th row: Brigantine

Common Values

Value | Count | Frequency (%)
WASHINGTON TWP PD | 60 | 0.6%
FRANKLIN TWP PD | 45 | 0.5%
SPRINGFIELD TWP PD | 30 | 0.3%
MANSFIELD TWP PD | 30 | 0.3%
MONROE TWP PD | 30 | 0.3%
OCEAN TWP PD | 30 | 0.3%
GREENWICH TWP PD | 30 | 0.3%
HAMILTON TWP PD | 30 | 0.3%
BRIDGETON PD | 15 | 0.2%
CUMBERLAND CO PROSECUTOR'S OFFICE | 15 | 0.2%
Other values (1046) | 9331 | 96.7%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
pd | 7350 | 26.9%
twp | 1950 | 7.1%
boro | 860 | 3.2%
co | 630 | 2.3%
county | 419 | 1.5%
police | 407 | 1.5%
township | 384 | 1.4%
office | 360 | 1.3%
park | 353 | 1.3%
state | 330 | 1.2%
Other values (497) | 14243 | 52.2%


report_type
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 75.5 KiB
Number of Offenses: 2222
Rate Per 100,000: 2222
Number of Clearances: 1734
Percent Cleared: 1734
Number of Arrests: 1734

Length

Max length: 20
Median length: 17
Mean length: 17.17976363
Min length: 15

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Number of Offenses
2nd row: Rate Per 100,000
3rd row: Number of Offenses
4th row: Rate Per 100,000
5th row: Number of Offenses

Common Values

Value | Count | Frequency (%)
Number of Offenses | 2222 | 23.0%
Rate Per 100,000 | 2222 | 23.0%
Number of Clearances | 1734 | 18.0%
Percent Cleared | 1734 | 18.0%
Number of Arrests | 1734 | 18.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Value | Count | Frequency (%)
number | 5690 | 20.9%
of | 5690 | 20.9%
offenses | 2222 | 8.2%
rate | 2222 | 8.2%
per | 2222 | 8.2%
100,000 | 2222 | 8.2%
clearances | 1734 | 6.4%
percent | 1734 | 6.4%
cleared | 1734 | 6.4%
arrests | 1734 | 6.4%


population
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 965
Distinct (%): 10.1%
Missing: 125
Missing (%): 1.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 15751.88762
Minimum: 0
Maximum: 283673
Zeros: 1135
Zeros (%): 11.8%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2934
Median: 8271
Q3: 18874
95-th percentile: 55348
Maximum: 283673
Range: 283673
Interquartile range (IQR): 15940

Descriptive statistics

Standard deviation: 24654.68159
Coefficient of variation (CV): 1.565189023
Kurtosis: 45.91403376
Mean: 15751.88762
Median Absolute Deviation (MAD): 6505
Skewness: 5.373037278
Sum: 149973722
Variance: 607853324.5
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 1135 | 11.8%
5844 | 30 | 0.3%
11602 | 30 | 0.3%
3463 | 30 | 0.3%
7679 | 30 | 0.3%
10308 | 30 | 0.3%
10504 | 17 | 0.2%
11 | 17 | 0.2%
730 | 17 | 0.2%
12748 | 17 | 0.2%
Other values (955) | 8168 | 84.7%
(Missing) | 125 | 1.3%

Value | Count | Frequency (%)
0 | 1135 | 11.8%
5 | 7 | 0.1%
11 | 17 | 0.2%
68 | 15 | 0.2%
69 | 2 | < 0.1%
181 | 17 | 0.2%
247 | 2 | < 0.1%
249 | 15 | 0.2%
277 | 15 | 0.2%
279 | 2 | < 0.1%

Value | Count | Frequency (%)
283673 | 2 | < 0.1%
282258 | 15 | 0.2%
270175 | 15 | 0.2%
267906 | 2 | < 0.1%
147690 | 2 | < 0.1%
146893 | 15 | 0.2%
129726 | 2 | < 0.1%
129080 | 15 | 0.2%
102915 | 15 | 0.2%
102759 | 2 | < 0.1%

murder
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 262
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.361580343
Minimum: 0
Maximum: 232.8
Zeros: 8427
Zeros (%): 87.4%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 4.6
Maximum: 232.8
Range: 232.8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 13.92887355
Coefficient of variation (CV): 5.898115468
Kurtosis: 73.8243738
Mean: 2.361580343
Median Absolute Deviation (MAD): 0
Skewness: 7.962270118
Sum: 22779.80399
Variance: 194.0135183
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 8427 | 87.4%
1 | 433 | 4.5%
100 | 125 | 1.3%
2 | 113 | 1.2%
3 | 61 | 0.6%
4 | 24 | 0.2%
5 | 19 | 0.2%
50 | 14 | 0.1%
6 | 13 | 0.1%
7 | 9 | 0.1%
Other values (252) | 408 | 4.2%

Value | Count | Frequency (%)
0 | 8427 | 87.4%
0.9731507702 | 1 | < 0.1%
0.9791538153 | 1 | < 0.1%
1 | 433 | 4.5%
1.1 | 3 | < 0.1%
1.127103457 | 1 | < 0.1%
1.2 | 1 | < 0.1%
1.3 | 1 | < 0.1%
1.4 | 2 | < 0.1%
1.40740011 | 1 | < 0.1%

Value | Count | Frequency (%)
232.8 | 1 | < 0.1%
200 | 8 | 0.1%
150 | 1 | < 0.1%
133 | 1 | < 0.1%
104.4 | 1 | < 0.1%
100 | 125 | 1.3%
86 | 1 | < 0.1%
80 | 1 | < 0.1%
77 | 1 | < 0.1%
76.9 | 1 | < 0.1%

rape
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 626
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.746099791
Minimum: 0
Maximum: 300
Zeros: 6523
Zeros (%): 67.6%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 36.4
Maximum: 300
Range: 300
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 17.90149691
Coefficient of variation (CV): 3.11541699
Kurtosis: 33.24681081
Mean: 5.746099791
Median Absolute Deviation (MAD): 0
Skewness: 4.959139155
Sum: 55426.87858
Variance: 320.4635918
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 6523 | 67.6%
1 | 786 | 8.1%
2 | 320 | 3.3%
3 | 190 | 2.0%
100 | 135 | 1.4%
4 | 113 | 1.2%
5 | 93 | 1.0%
50 | 72 | 0.7%
6 | 62 | 0.6%
7 | 52 | 0.5%
Other values (616) | 1300 | 13.5%

Value | Count | Frequency (%)
0 | 6523 | 67.6%
1 | 786 | 8.1%
1.427490614 | 1 | < 0.1%
1.5 | 3 | < 0.1%
1.7 | 1 | < 0.1%
1.821394095 | 1 | < 0.1%
1.9 | 1 | < 0.1%
2 | 320 | 3.3%
2.1 | 2 | < 0.1%
2.2 | 1 | < 0.1%

Value | Count | Frequency (%)
300 | 1 | < 0.1%
250 | 1 | < 0.1%
232.8 | 1 | < 0.1%
200 | 1 | < 0.1%
170.4 | 1 | < 0.1%
167 | 1 | < 0.1%
165.1 | 1 | < 0.1%
157.6 | 1 | < 0.1%
157.1 | 1 | < 0.1%
150 | 3 | < 0.1%

robbery
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 927
Distinct (%): 9.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.04854483
Minimum: 0
Maximum: 5797.101449
Zeros: 5179
Zeros (%): 53.7%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 8
95-th percentile: 75
Maximum: 5797.101449
Range: 5797.101449
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 74.20257555
Coefficient of variation (CV): 4.930880452
Kurtosis: 3833.987558
Mean: 15.04854483
Median Absolute Deviation (MAD): 0
Skewness: 50.75171203
Sum: 145158.2635
Variance: 5506.022218
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 5179 | 53.7%
1 | 760 | 7.9%
2 | 408 | 4.2%
3 | 254 | 2.6%
4 | 179 | 1.9%
100 | 151 | 1.6%
5 | 150 | 1.6%
6 | 110 | 1.1%
50 | 109 | 1.1%
9 | 83 | 0.9%
Other values (917) | 2263 | 23.5%

Value | Count | Frequency (%)
0 | 5179 | 53.7%
1 | 760 | 7.9%
1.9 | 1 | < 0.1%
2 | 408 | 4.2%
2.5 | 5 | 0.1%
3 | 254 | 2.6%
3.1 | 2 | < 0.1%
3.300221115 | 1 | < 0.1%
3.4 | 2 | < 0.1%
3.585385967 | 1 | < 0.1%

Value | Count | Frequency (%)
5797.101449 | 1 | < 0.1%
1070 | 1 | < 0.1%
689 | 1 | < 0.1%
636 | 1 | < 0.1%
589.4855035 | 1 | < 0.1%
588.0676666 | 1 | < 0.1%
571.3 | 1 | < 0.1%
551.8243853 | 1 | < 0.1%
534 | 1 | < 0.1%
532 | 1 | < 0.1%

assault
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 1331
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.77837206
Minimum: 0
Maximum: 20000
Zeros: 2706
Zeros (%): 28.1%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 5
Q3: 40.76535647
95-th percentile: 115
Maximum: 20000
Range: 20000
Interquartile range (IQR): 40.76535647

Descriptive statistics

Standard deviation: 218.4232365
Coefficient of variation (CV): 6.104895889
Kurtosis: 7239.206448
Mean: 35.77837206
Median Absolute Deviation (MAD): 5
Skewness: 79.5400419
Sum: 345118.1768
Variance: 47708.71023
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 2706 | 28.1%
1 | 789 | 8.2%
2 | 521 | 5.4%
100 | 436 | 4.5%
3 | 342 | 3.5%
4 | 288 | 3.0%
5 | 226 | 2.3%
6 | 208 | 2.2%
7 | 163 | 1.7%
8 | 152 | 1.6%
Other values (1321) | 3815 | 39.6%

Value | Count | Frequency (%)
0 | 2706 | 28.1%
1 | 789 | 8.2%
2 | 521 | 5.4%
3 | 342 | 3.5%
3.673904258 | 1 | < 0.1%
3.7 | 1 | < 0.1%
3.8 | 1 | < 0.1%
3.9 | 1 | < 0.1%
4 | 288 | 3.0%
4.3 | 1 | < 0.1%

Value | Count | Frequency (%)
20000 | 1 | < 0.1%
1470.6 | 1 | < 0.1%
1449.275362 | 1 | < 0.1%
1284.001131 | 1 | < 0.1%
1280 | 1 | < 0.1%
1192.2 | 1 | < 0.1%
1155 | 1 | < 0.1%
1045.9 | 1 | < 0.1%
1036.8 | 1 | < 0.1%
1033.6 | 1 | < 0.1%

burglary
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 1715
Distinct (%): 17.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.61983173
Minimum: 0
Maximum: 2068.302068
Zeros: 2492
Zeros (%): 25.8%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 7
Q3: 38
95-th percentile: 258.3318401
Maximum: 2068.302068
Range: 2068.302068
Interquartile range (IQR): 38

Descriptive statistics

Standard deviation: 128.4650459
Coefficient of variation (CV): 2.488676186
Kurtosis: 41.86164662
Mean: 51.61983173
Median Absolute Deviation (MAD): 7
Skewness: 5.420449065
Sum: 497924.8968
Variance: 16503.26803
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 2492 | 25.8%
1 | 682 | 7.1%
2 | 442 | 4.6%
3 | 348 | 3.6%
4 | 290 | 3.0%
5 | 236 | 2.4%
6 | 219 | 2.3%
8 | 188 | 1.9%
7 | 169 | 1.8%
11 | 160 | 1.7%
Other values (1705) | 4420 | 45.8%

Value | Count | Frequency (%)
0 | 2492 | 25.8%
1 | 682 | 7.1%
2 | 442 | 4.6%
3 | 348 | 3.6%
4 | 290 | 3.0%
5 | 236 | 2.4%
6 | 219 | 2.3%
7 | 169 | 1.8%
7.1 | 1 | < 0.1%
7.8 | 1 | < 0.1%

Value | Count | Frequency (%)
2068.302068 | 1 | < 0.1%
1732.283465 | 1 | < 0.1%
1715.7 | 1 | < 0.1%
1629.3 | 1 | < 0.1%
1587 | 1 | < 0.1%
1549.413735 | 1 | < 0.1%
1360.5 | 1 | < 0.1%
1351.4 | 1 | < 0.1%
1323.1 | 1 | < 0.1%
1300 | 1 | < 0.1%

larceny
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 2362
Distinct (%): 24.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 333.6074983
Minimum: 0
Maximum: 194117.6
Zeros: 1326
Zeros (%): 13.7%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 25
Q3: 198.15
95-th percentile: 1311.733834
Maximum: 194117.6
Range: 194117.6
Interquartile range (IQR): 193.15

Descriptive statistics

Standard deviation: 3398.150131
Coefficient of variation (CV): 10.1860724
Kurtosis: 2387.100962
Mean: 333.6074983
Median Absolute Deviation (MAD): 25
Skewness: 47.28986587
Sum: 3217977.928
Variance: 11547424.31
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 1326 | 13.7%
1 | 309 | 3.2%
2 | 271 | 2.8%
3 | 238 | 2.5%
5 | 192 | 2.0%
6 | 188 | 1.9%
4 | 187 | 1.9%
9 | 167 | 1.7%
7 | 166 | 1.7%
8 | 157 | 1.6%
Other values (2352) | 6445 | 66.8%

Value | Count | Frequency (%)
0 | 1326 | 13.7%
1 | 309 | 3.2%
2 | 271 | 2.8%
3 | 238 | 2.5%
4 | 187 | 1.9%
5 | 192 | 2.0%
6 | 188 | 1.9%
7 | 166 | 1.7%
8 | 157 | 1.6%
9 | 167 | 1.7%

Value | Count | Frequency (%)
194117.6 | 1 | < 0.1%
175362.3188 | 1 | < 0.1%
145588.2 | 1 | < 0.1%
129411.8 | 1 | < 0.1%
20000 | 2 | < 0.1%
17885.5 | 1 | < 0.1%
13149.6063 | 1 | < 0.1%
11633.9 | 1 | < 0.1%
11526.2 | 1 | < 0.1%
11152 | 1 | < 0.1%

auto_theft
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 1338
Distinct (%): 13.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.5754178
Minimum: 0
Maximum: 7352.9
Zeros: 3954
Zeros (%): 41.0%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 17
95-th percentile: 104.7584032
Maximum: 7352.9
Range: 7352.9
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 123.207139
Coefficient of variation (CV): 4.817404744
Kurtosis: 1548.240139
Mean: 25.5754178
Median Absolute Deviation (MAD): 1
Skewness: 31.76598427
Sum: 246700.4801
Variance: 15179.99911
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 3954 | 41.0%
1 | 934 | 9.7%
2 | 470 | 4.9%
3 | 311 | 3.2%
4 | 269 | 2.8%
5 | 184 | 1.9%
6 | 167 | 1.7%
7 | 131 | 1.4%
8 | 117 | 1.2%
9 | 100 | 1.0%
Other values (1328) | 3009 | 31.2%

Value | Count | Frequency (%)
0 | 3954 | 41.0%
1 | 934 | 9.7%
2 | 470 | 4.9%
3 | 311 | 3.2%
3.4 | 1 | < 0.1%
3.9 | 1 | < 0.1%
4 | 269 | 2.8%
4.2 | 1 | < 0.1%
4.254775986 | 1 | < 0.1%
4.5 | 1 | < 0.1%

Value | Count | Frequency (%)
7352.9 | 1 | < 0.1%
4411.8 | 1 | < 0.1%
2898.550725 | 1 | < 0.1%
2465.8 | 1 | < 0.1%
2392 | 1 | < 0.1%
2015 | 1 | < 0.1%
1707 | 1 | < 0.1%
1643.8 | 1 | < 0.1%
1606.4 | 1 | < 0.1%
1541 | 1 | < 0.1%

total
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 2519
Distinct (%): 26.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446.587915
Minimum: 0
Maximum: 202941.2
Zeros: 913
Zeros (%): 9.5%
Negative: 0
Negative (%): 0.0%
Memory size: 75.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
Median: 36
Q3: 292
95-th percentile: 1850.6
Maximum: 202941.2
Range: 202941.2
Interquartile range (IQR): 282

Descriptive statistics

Standard deviation: 3593.996656
Coefficient of variation (CV): 8.047680054
Kurtosis: 2264.708798
Mean: 446.587915
Median Absolute Deviation (MAD): 35
Skewness: 45.38515893
Sum: 4307787.028
Variance: 12916811.96
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 913 | 9.5%
2 | 198 | 2.1%
1 | 190 | 2.0%
4 | 184 | 1.9%
6 | 163 | 1.7%
3 | 160 | 1.7%
13 | 152 | 1.6%
8 | 149 | 1.5%
14 | 145 | 1.5%
7 | 140 | 1.5%
Other values (2509) | 7252 | 75.2%

Value | Count | Frequency (%)
0 | 913 | 9.5%
1 | 190 | 2.0%
2 | 198 | 2.1%
3 | 160 | 1.7%
4 | 184 | 1.9%
5 | 128 | 1.3%
6 | 163 | 1.7%
7 | 140 | 1.5%
8 | 149 | 1.5%
9 | 131 | 1.4%

Value | Count | Frequency (%)
202941.2 | 1 | < 0.1%
185507.2464 | 1 | < 0.1%
150000 | 1 | < 0.1%
130882.4 | 1 | < 0.1%
40000 | 1 | < 0.1%
20000 | 1 | < 0.1%
19793.3 | 1 | < 0.1%
15039.37008 | 1 | < 0.1%
13235.3 | 1 | < 0.1%
12877.6 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots of the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and 1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
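
The calculation described above can be sketched in plain Python as follows. Ties receive the mean of their rank positions, which is a common convention but an assumption here; the function names and sample data are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x, y):
    """Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear
print(spearman(x, y), pearson(x, y))
```

For this monotonic but nonlinear relationship, ρ is 1 (up to floating-point error) while r is below 1.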

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
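
A minimal plain-Python sketch of this calculation, with a check of the invariance under changes of location and (positive) scale. The function name and the sample data are illustrative.

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = (sum((a - mean_x) ** 2 for a in x) / n) ** 0.5
    std_y = (sum((b - mean_y) ** 2 for b in y) / n) ** 0.5
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
r = pearson_r(x, y)

# Shift and rescale each variable separately; r is unchanged as long as
# both scale factors are positive (a negative factor flips the sign).
r_transformed = pearson_r([10 * a + 5 for a in x], [0.5 * b - 3 for b in y])
print(r, r_transformed)
```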

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and 1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
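
The formula as stated, with the total number of pairs in the denominator, corresponds to the variant usually called tau-a. The sketch below implements exactly that; statistical libraries often use tie-adjusted variants instead, so results on data with ties may differ. Names and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie in x or y; it counts as neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant pairs, 1 discordant, 6 pairs in total
print(tau)
```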

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
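
A minimal sketch of a bias-corrected Cramér's V in plain Python, following the correction commonly attributed to Bergsma (2013). It is an illustration under that assumption, not necessarily the report's exact implementation; the function name and toy data are invented.

```python
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table (a single category)
    return (phi2_corr / denom) ** 0.5


perfect = cramers_v_corrected(["x", "y"] * 50, ["p", "q"] * 50)
independent = cramers_v_corrected(["x", "x", "y", "y"] * 25, ["p", "q", "p", "q"] * 25)
print(perfect, independent)
```

The first toy pair is perfectly associated and the second is independent, so the two calls illustrate the ends of the 0 to 1 range.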

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

index | county_name | year | agency | report_type | population | murder | rape | robbery | assault | burglary | larceny | auto_theft | total
0 | ATLANTIC | 2017 | Absecon | Number of Offenses | 8,261.00 | 1.00 | 2.00 | 5.00 | 3.00 | 27.00 | 194.00 | 8.00 | 240.00
1 | ATLANTIC | 2017 | Absecon | Rate Per 100,000 | 8,261.00 | 12.11 | 24.21 | 60.53 | 36.32 | 326.84 | 2,348.38 | 96.84 | 2,905.22
2 | ATLANTIC | 2017 | Atlantic City | Number of Offenses | 38,601.00 | 13.00 | 24.00 | 227.00 | 161.00 | 319.00 | 1,298.00 | 115.00 | 2,157.00
3 | ATLANTIC | 2017 | Atlantic City | Rate Per 100,000 | 38,601.00 | 33.68 | 62.17 | 588.07 | 417.09 | 826.40 | 3,362.61 | 297.92 | 5,587.94
4 | ATLANTIC | 2017 | Brigantine | Number of Offenses | 8,976.00 | 0.00 | 0.00 | 0.00 | 2.00 | 24.00 | 84.00 | 2.00 | 112.00
5 | ATLANTIC | 2017 | Brigantine | Rate Per 100,000 | 8,976.00 | 0.00 | 0.00 | 0.00 | 22.28 | 267.38 | 935.83 | 22.28 | 1,247.77
6 | ATLANTIC | 2017 | Buena | Number of Offenses | 4,432.00 | 2.00 | 2.00 | 3.00 | 7.00 | 19.00 | 90.00 | 2.00 | 125.00
7 | ATLANTIC | 2017 | Buena | Rate Per 100,000 | 4,432.00 | 45.13 | 45.13 | 67.69 | 157.94 | 428.70 | 2,030.69 | 45.13 | 2,820.40
8 | ATLANTIC | 2017 | Egg Harbor City | Number of Offenses | 4,182.00 | 0.00 | 1.00 | 7.00 | 7.00 | 20.00 | 107.00 | 9.00 | 151.00
9 | ATLANTIC | 2017 | Egg Harbor City | Rate Per 100,000 | 4,182.00 | 0.00 | 23.91 | 167.38 | 167.38 | 478.24 | 2,558.58 | 215.21 | 3,610.71
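
The "Rate Per 100,000" rows in this sample are consistent with dividing each offense count by the agency's population and scaling by 100,000. A short sketch using the Absecon 2017 rows (indices 0 and 1); the function name is illustrative, and returning None for a zero population is an assumption about how such rows could be guarded.

```python
def rate_per_100k(offenses, population):
    """Offenses per 100,000 residents, rounded to two decimals."""
    if population == 0:
        return None  # the rate is undefined without a population
    return round(offenses / population * 100_000, 2)


# Absecon, 2017: counts from the "Number of Offenses" row.
population = 8261
offenses = {
    "murder": 1, "rape": 2, "robbery": 5, "assault": 3,
    "burglary": 27, "larceny": 194, "auto_theft": 8, "total": 240,
}
rates = {name: rate_per_100k(count, population) for name, count in offenses.items()}
print(rates)
```

The computed values match the "Rate Per 100,000" row for Absecon. Since the population column contains zeros, a guard of some kind is needed before dividing.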

Last rows

index | county_name | year | agency | report_type | population | murder | rape | robbery | assault | burglary | larceny | auto_theft | total
9636 | WARREN | 2020 | WASHINGTON BORO PD | Number of Offenses | 6,535.00 | 0.00 | 0.00 | 0.00 | 2.00 | 15.00 | 37.00 | 1.00 | 55.00
9637 | WARREN | 2020 | WASHINGTON BORO PD | Rate Per 100,000 | 6,535.00 | 0.00 | 0.00 | 0.00 | 30.60 | 229.50 | 566.20 | 15.30 | 841.60
9638 | WARREN | 2020 | WASHINGTON BORO PD | Number of Clearances | 6,535.00 | 0.00 | 0.00 | 0.00 | 2.00 | 0.00 | 2.00 | 1.00 | 5.00
9639 | WARREN | 2020 | WASHINGTON BORO PD | Percent Cleared | 6,535.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 5.00 | 100.00 | 9.00
9640 | WARREN | 2020 | WASHINGTON BORO PD | Number of Arrests | 6,535.00 | 0.00 | 0.00 | 0.00 | 3.00 | 0.00 | 2.00 | 0.00 | 5.00
9641 | WARREN | 2020 | WASHINGTON TWP PD | Number of Offenses | 6,434.00 | 0.00 | 0.00 | 0.00 | 1.00 | 6.00 | 43.00 | 2.00 | 52.00
9642 | WARREN | 2020 | WASHINGTON TWP PD | Rate Per 100,000 | 6,434.00 | 0.00 | 0.00 | 0.00 | 15.50 | 93.30 | 668.30 | 31.10 | 808.20
9643 | WARREN | 2020 | WASHINGTON TWP PD | Number of Clearances | 6,434.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 3.00 | 0.00 | 4.00
9644 | WARREN | 2020 | WASHINGTON TWP PD | Percent Cleared | 6,434.00 | 0.00 | 0.00 | 0.00 | 0.00 | 17.00 | 7.00 | 0.00 | 8.00
9645 | WARREN | 2020 | WASHINGTON TWP PD | Number of Arrests | 6,434.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 3.00 | 0.00 | 4.00
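
The "Percent Cleared" rows in this sample appear to be clearances divided by offenses, expressed as a whole percentage. The sketch below reproduces the WASHINGTON BORO PD 2020 row on that basis. The rounding to whole numbers and the value 0 for categories with no offenses are inferred from these rows, not documented in the report; the function name is illustrative.

```python
def percent_cleared(clearances, offenses):
    """Clearances as a whole-number percentage of offenses (0 when no offenses)."""
    if offenses == 0:
        return 0.0
    return float(round(100 * clearances / offenses))


# WASHINGTON BORO PD, 2020 (rows 9636 and 9638).
offenses = {"assault": 2, "burglary": 15, "larceny": 37, "auto_theft": 1, "total": 55}
clearances = {"assault": 2, "burglary": 0, "larceny": 2, "auto_theft": 1, "total": 5}
cleared = {name: percent_cleared(clearances[name], offenses[name]) for name in offenses}
print(cleared)
```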